{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<center>\n",
    "    <p style=\"text-align:center\">\n",
    "        <img alt=\"phoenix logo\" src=\"https://storage.googleapis.com/arize-phoenix-assets/assets/phoenix-logo-light.svg\" width=\"200\"/>\n",
    "        <br>\n",
    "        <a href=\"https://arize.com/docs/phoenix/\">Docs</a>\n",
    "        |\n",
    "        <a href=\"https://github.com/Arize-ai/phoenix\">GitHub</a>\n",
    "        |\n",
    "        <a href=\"https://arize-ai.slack.com/join/shared_invite/zt-2w57bhem8-hq24MB6u7yE_ZF_ilOYSBw#/shared-invite/email\">Community</a>\n",
    "    </p>\n",
    "</center>\n",
    "<h1 align=\"center\">Tracing and Evaluating a LlamaIndex Application</h1>\n",
    "\n",
    "LlamaIndex provides high-level APIs that enable users to build powerful applications in a few lines of code. However, it can be challenging to understand what is going on under the hood and to pinpoint the cause of issues. Phoenix makes your LLM applications *observable* by visualizing the underlying structure of each call to your query engine and surfacing problematic `spans` of execution based on latency, token count, or other evaluation metrics.\n",
    "\n",
    "In this tutorial, you will:\n",
    "- Build a simple query engine using LlamaIndex that uses retrieval-augmented generation to answer questions over the Arize documentation,\n",
    "- Record trace data in [OpenInference tracing](https://github.com/Arize-ai/openinference) format using the global `arize_phoenix` handler\n",
    "- Inspect the traces and spans of your application to identify sources of latency and cost,\n",
    "- Export your trace data as a pandas dataframe and run an [LLM Evals](https://arize.com/docs/phoenix/concepts/llm-evals) to measure the precision@k of the query engine's retrieval step.\n",
    "\n",
    "ℹ️ This notebook requires an OpenAI API key."
   ]
  },
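  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of the precision@k metric mentioned above, the sketch below computes it from a list of relevance judgments. It assumes each retrieved document has already been labeled relevant or not (in this tutorial, such labels would come from LLM Evals). The function name `precision_at_k` and the sample labels are hypothetical and are not part of Phoenix or LlamaIndex.\n",
    "\n",
    "```python\n",
    "def precision_at_k(relevance_labels, k):\n",
    "    # Fraction of the top-k retrieved documents that were judged relevant.\n",
    "    if k <= 0:\n",
    "        raise ValueError(\"k must be a positive integer\")\n",
    "    return sum(relevance_labels[:k]) / k\n",
    "\n",
    "\n",
    "# Hypothetical judgments for four retrieved documents, in ranked order.\n",
    "labels = [True, False, True, True]\n",
    "print(precision_at_k(labels, k=2))  # 1 relevant out of the top 2 -> 0.5\n",
    "```\n",
    "\n",
    "Dividing by `k` rather than by the number of documents actually returned is a design choice in this sketch: it penalizes a retriever that returns fewer than `k` documents."
   ]
  },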
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Install Dependencies and Import Libraries\n",
    "\n",
    "Install Phoenix, LlamaIndex, and OpenAI."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install \"arize-phoenix[evals]\" \"openai>=1\" 'httpx<0.28' nest-asyncio \"openinference-instrumentation-llama-index>=2.0.0\" \"llama-index-core\" \"llama-index-llms-openai\" \"llama-index-embeddings-openai\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Import libraries."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "from getpass import getpass\n",
    "\n",
    "import nest_asyncio\n",
    "import pandas as pd\n",
    "from llama_index.core import Document, Settings, VectorStoreIndex\n",
    "from llama_index.embeddings.openai import OpenAIEmbedding\n",
    "from llama_index.llms.openai import OpenAI\n",
    "from tqdm import tqdm\n",
    "\n",
    "nest_asyncio.apply()  # needed for concurrent evals in notebook environments\n",
    "pd.set_option(\"display.max_colwidth\", 1000)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Set Up Phoenix \n",
    "\n",
    "You'll need Phoenix running to collect and visualize trace data from your LlamaIndex application. You have two options:\n",
    "\n",
    "### Option 1: Local Phoenix (Free)\n",
    "\n",
    "Run Phoenix locally on your machine. Install Phoenix and launch it in a separate terminal:\n",
    "\n",
    "```bash\n",
    "pip install arize-phoenix\n",
    "phoenix serve\n",
    "```\n",
    "\n",
    "This will start Phoenix at `http://localhost:6006`. You can view the Phoenix UI by opening this URL in your browser.\n",
    "\n",
    "### Option 2: Phoenix Cloud (Hosted)\n",
    "\n",
    "Use Phoenix Cloud for a hosted solution. Sign up for a free account at [Phoenix Cloud](https://app.phoenix.arize.com) and get your API key. Then set it as an environment variable:\n",
    "\n",
    "```bash\n",
    "export PHOENIX_COLLECTOR_ENDPOINT=your_endpoint_here\n",
    "export PHOENIX_API_KEY=your_api_key_here\n",
    "```\n",
    "\n",
    "The instrumentation code will automatically detect your API key and send traces to Phoenix Cloud.\n"
   ]
  },
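  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If exporting shell variables is inconvenient (for example, in a hosted notebook), one alternative is to set the same two variables from Python before running the instrumentation cell. This is a minimal sketch: the values shown are the same placeholders used in the Phoenix Cloud instructions and must be replaced with your own endpoint and API key.\n",
    "\n",
    "```python\n",
    "import os\n",
    "\n",
    "# Placeholder values (assumptions): replace with the endpoint and key from your account.\n",
    "# setdefault leaves any value already exported in your shell untouched.\n",
    "os.environ.setdefault(\"PHOENIX_COLLECTOR_ENDPOINT\", \"your_endpoint_here\")\n",
    "os.environ.setdefault(\"PHOENIX_API_KEY\", \"your_api_key_here\")\n",
    "```\n",
    "\n",
    "Avoid committing real keys to a shared notebook. Prompting with `getpass`, as done for the OpenAI key later in this tutorial, is a safer pattern."
   ]
  },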
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Configure Your OpenAI API Key\n",
    "\n",
    "Set your OpenAI API key if it is not already set as an environment variable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if not (openai_api_key := os.getenv(\"OPENAI_API_KEY\")):\n",
    "    openai_api_key = getpass(\"🔑 Enter your OpenAI API key: \")\n",
    "\n",
    "os.environ[\"OPENAI_API_KEY\"] = openai_api_key"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Build Your LlamaIndex Application"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This example creates a simple `VectorStoreIndex` with documents about Arize and Phoenix for demonstration purposes. You can replace these with your own documents.\n",
    "\n",
    "Let's create a document collection and build our index:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Configure LlamaIndex settings\n",
    "Settings.llm = OpenAI(model=\"gpt-4o\")\n",
    "Settings.embed_model = OpenAIEmbedding(model=\"text-embedding-ada-002\")\n",
    "\n",
    "# Create a simple document collection about Arize and Phoenix for demo\n",
    "documents = [\n",
    "    Document(\n",
    "        text=\"Phoenix is Arize's open-source observability tool for LLM applications, providing tracing and evaluation capabilities. It helps developers understand LLM application performance and debug issues.\"\n",
    "    ),\n",
    "    Document(\n",
    "        text=\"LLM tracing with Phoenix helps you understand the performance and behavior of your language model applications. You can track token usage, latency, and see the complete flow of your RAG pipeline.\"\n",
    "    ),\n",
    "    Document(\n",
    "        text=\"Retrieval-augmented generation (RAG) combines information retrieval with language generation for better responses. It allows LLMs to access external knowledge beyond their training data.\"\n",
    "    ),\n",
    "    Document(\n",
    "        text=\"Model evaluation is crucial for understanding how well your ML models perform on real-world data. Phoenix provides LLM evals to assess quality, relevance, and hallucinations.\"\n",
    "    ),\n",
    "    Document(\n",
    "        text=\"Vector embeddings are numerical representations of text that capture semantic meaning. They enable similarity search and retrieval in RAG systems.\"\n",
    "    ),\n",
    "    Document(\n",
    "        text=\"OpenInference is an open standard for LLM application observability. It defines how to capture and store trace data from LLM applications.\"\n",
    "    ),\n",
    "    Document(\n",
    "        text=\"Span annotations in Phoenix allow you to add custom metadata and evaluations to your traces, helping you analyze and improve your LLM applications.\"\n",
    "    ),\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Enable Phoenix tracing via `LlamaIndexInstrumentor`. Phoenix uses OpenInference traces - an open-source standard for capturing and storing LLM application traces that enables LLM applications to seamlessly integrate with LLM observability solutions such as Phoenix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from openinference.instrumentation.llama_index import LlamaIndexInstrumentor\n",
    "\n",
    "from phoenix.otel import register\n",
    "\n",
    "tracer_provider = register(project_name=\"llamaindex-tracing-tutorial\", protocol=\"http/protobuf\")\n",
    "LlamaIndexInstrumentor().instrument(\n",
    "    tracer_provider=tracer_provider,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We are now ready to instantiate our query engine that will perform retrieval-augmented generation (RAG). Query engine is a generic interface in LlamaIndex that allows you to ask question over your data. A query engine takes in a natural language query, and returns a rich response. It is built on top of Retrievers. You can compose multiple query engines to achieve more advanced capability"
   ]
  },
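  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the retrieval step more concrete, here is a toy sketch of what a vector retriever does conceptually: it ranks stored embeddings by cosine similarity to the query embedding and returns the top `k`. This is illustrative only and is not LlamaIndex's implementation. The vectors, the document names, and the `top_k` helper are all made up for the example.\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "\n",
    "def cosine_similarity(a, b):\n",
    "    # Cosine of the angle between two equal-length vectors.\n",
    "    dot = sum(x * y for x, y in zip(a, b))\n",
    "    norm_a = math.sqrt(sum(x * x for x in a))\n",
    "    norm_b = math.sqrt(sum(y * y for y in b))\n",
    "    return dot / (norm_a * norm_b)\n",
    "\n",
    "\n",
    "def top_k(query_vec, doc_vecs, k=2):\n",
    "    # Rank document names by similarity to the query and keep the best k.\n",
    "    ranked = sorted(\n",
    "        doc_vecs,\n",
    "        key=lambda name: cosine_similarity(query_vec, doc_vecs[name]),\n",
    "        reverse=True,\n",
    "    )\n",
    "    return ranked[:k]\n",
    "\n",
    "\n",
    "# Made-up 3-dimensional vectors; real embeddings have many more dimensions.\n",
    "toy_docs = {\n",
    "    \"phoenix\": [0.9, 0.1, 0.0],\n",
    "    \"rag\": [0.1, 0.9, 0.1],\n",
    "    \"embeddings\": [0.0, 0.2, 0.9],\n",
    "}\n",
    "print(top_k([0.8, 0.3, 0.0], toy_docs, k=2))  # ['phoenix', 'rag']\n",
    "```\n",
    "\n",
    "In the query engine built in the next cell, the retrieved text is then passed to the LLM as context for generating the answer."
   ]
  },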
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create the vector index from our documents\n",
    "print(\"Creating vector index...\")\n",
    "index = VectorStoreIndex.from_documents(documents)\n",
    "\n",
    "# Create the query engine\n",
    "query_engine = index.as_query_engine()\n",
    "print(\"✅ Query engine ready!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Run Your Query Engine and View Your Traces in Phoenix\n",
    "\n",
    "Let's create some sample queries that relate to our document collection and test our query engine:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sample queries that relate to our document collection\n",
    "queries = [\n",
    "    \"What is Arize and what does it help with?\",\n",
    "    \"How does Phoenix help with LLM observability?\",\n",
    "    \"What is retrieval-augmented generation?\",\n",
    "    \"How can I evaluate my AI Agents?\",\n",
    "    \"What are vector embeddings used for?\",\n",
    "    \"What is OpenInference?\",\n",
    "    \"How do span annotations work in Phoenix?\",\n",
    "    \"What kind of monitoring does Arize provide for AI Agents?\",\n",
    "]\n",
    "\n",
    "print(\"Sample queries:\")\n",
    "for i, query in enumerate(queries, 1):\n",
    "    print(f\"{i}. {query}\")\n",
    "print(f\"\\nTotal queries: {len(queries)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's run these queries and view the traces in Phoenix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for query in tqdm(queries):\n",
    "    response = query_engine.query(query)\n",
    "    print(f\"Query: {query}\")\n",
    "    print(f\"Response: {response}\\n\")\n",
    "    print(\"-\" * 50)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And just for fun, ask your own question!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "response = query_engine.query(\"What is Arize and how can it help me as an AI Engineer?\")\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Check the Phoenix UI as your queries run. Your traces should appear in real time.\n",
    "\n",
    "<img src=\"https://storage.googleapis.com/arize-phoenix-assets/assets/docs/notebooks/llama-index-knowledge-base-tutorial/Screenshot%202025-09-08%20at%206.53.21%E2%80%AFPM.png\" alt=\"Trace Details View on Phoenix\" style=\"width:100%; height:auto;\">"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
